Listing of Claims:
Claims 1-23 (Cancelled) 

24. (Previously Presented) A method for retrieving information using a search engine, the 
method comprising: 

retrieving a document to be indexed; 

generating a virtual document based on the retrieved document, the virtual document 
comprising a portion of the retrieved document that characterizes an overall content of the 
retrieved document and being used to index the retrieved document; 

decomposing the virtual document into a plurality of tokens; and

storing the plurality of tokens in a search index, wherein the search engine accesses the
search index to identify one or more virtual documents that satisfy a search query and retrieves
one or more documents corresponding to the one or more virtual documents.

25. (Previously Presented) The method of claim 24, further comprising: 

recording position information relating to the portion of the retrieved document that 
characterizes the overall content of the retrieved document. 

26. (Previously Presented) The method of claim 25, further comprising:

storing the recorded positional information with the plurality of tokens in the search
index.

27. (Previously Presented) The method of claim 24, wherein the portion of the retrieved
document that characterizes the overall content of the retrieved document is a summary of the
retrieved document.

28. (Previously Presented) The method of claim 24, wherein generating the virtual document
based on the retrieved document comprises: 

extracting from the retrieved document a collection of words, features, whole sentences, 
or parts of sentences that characterizes the overall content of the retrieved document. 

29. (Previously Presented) The method of claim 28, wherein extraction of the collection of
words, features, whole sentences, or parts of sentences is based on frequency of occurrence, 
proximity to the beginning or end of a paragraph, proximity to the beginning or end of the 
retrieved document, or position within a certain document structure in the retrieved document. 

30. (Previously Presented) The method of claim 24, wherein each of the plurality of tokens 
comprises a word, a feature, a whole sentence, or a part of a sentence in the virtual document. 

31. (Previously Presented) The method of claim 24, wherein the retrieved document is a
web-page.

32. (Previously Presented) A computer readable medium containing a computer program for
retrieving information using a search engine, the computer program comprising program
instructions for:

retrieving a document to be indexed; 

generating a virtual document based on the retrieved document, the virtual document 
comprising a portion of the retrieved document that characterizes an overall content of the 
retrieved document and being used to index the retrieved document; 

decomposing the virtual document into a plurality of tokens; and

storing the plurality of tokens in a search index, wherein the search engine accesses the
search index to identify one or more virtual documents that satisfy a search query and retrieves
one or more documents corresponding to the one or more virtual documents.

33. (Previously Presented) The computer readable medium of claim 32, wherein the
computer program further comprises program instructions for: 

recording position information relating to the portion of the retrieved document that 
characterizes the overall content of the retrieved document. 

34. (Previously Presented) The computer readable medium of claim 33, wherein the 
computer program further comprises program instructions for: 

storing the recorded positional information with the plurality of tokens in the search
index.

35. (Previously Presented) The computer readable medium of claim 32, wherein the portion
of the retrieved document that characterizes the overall content of the retrieved document is a
summary of the retrieved document.

36. (Previously Presented) The computer readable medium of claim 32, wherein generating 
the virtual document based on the retrieved document comprises: 

extracting from the retrieved document a collection of words, features, whole sentences, 
or parts of sentences that characterizes the overall content of the retrieved document. 

37. (Previously Presented) The computer readable medium of claim 36, wherein extraction 
of the collection of words, features, whole sentences, or parts of sentences is based on frequency 
of occurrence, proximity to the beginning or end of a paragraph, proximity to the beginning or 
end of the retrieved document, or position within a certain document structure in the retrieved 
document. 

38. (Previously Presented) The computer readable medium of claim 32, wherein each of the 
plurality of tokens comprises a word, a feature, a whole sentence, or a part of a sentence in the 
virtual document.

39. (Previously Presented) The computer readable medium of claim 32, wherein the 
retrieved document is a web-page.

40. (Previously Presented) A system for retrieving information using a search engine, the
system comprising: 

a crawler for retrieving a document to be indexed; 

an extractor coupled to the crawler for generating a virtual document based on the 
retrieved document, the virtual document comprising a portion of the retrieved document that
characterizes an overall content of the retrieved document and being used to index the retrieved 
document; 

a storage device coupled to the extractor for storing the virtual document; 

an indexer coupled to the storage device for decomposing the virtual document into a 
plurality of tokens; and 

a search index coupled to the indexer for storing the plurality of tokens, wherein the 
search engine accesses the search index to identify one or more virtual documents that satisfy a 
search query and retrieves one or more documents corresponding to the one or more virtual 
documents.

41. (Previously Presented) The system of claim 40, wherein the extractor records position
information relating to the portion of the retrieved document that characterizes the overall 
content of the retrieved document. 

42. (Previously Presented) The system of claim 41, wherein the search index stores the
recorded positional information with the plurality of tokens.

43. (Previously Presented) The system of claim 40, wherein the portion of the retrieved
document that characterizes the overall content of the retrieved document is a summary of the 
retrieved document. 

44. (Previously Presented) The system of claim 40, wherein generating the virtual document 
based on the retrieved document comprises: 

extracting from the retrieved document a collection of words, features, whole sentences,
or parts of sentences that characterizes the overall content of the retrieved document.

45. (Previously Presented) The system of claim 44, wherein extraction of the collection of 
words, features, whole sentences, or parts of sentences is based on frequency of occurrence, 
proximity to the beginning or end of a paragraph, proximity to the beginning or end of the
retrieved document, or position within a certain document structure in the retrieved document. 

46. (Previously Presented) The system of claim 40, wherein each of the plurality of tokens 
comprises a word, a feature, a whole sentence, or a part of a sentence in the virtual document.

47. (Previously Presented) The system of claim 40, wherein the retrieved document is a
web-page.



PAGE 9/13 * RCVD AT 3/24/2006 2:29:12 PM [Eastern Standard Time] * SVR:USPTO-EFXRF-6/37 * DNIS:2738300 * CSID: * DURATION (mm-ss):03-06



